Document Text Extraction from Document Images Using Haar Discrete Wavelet Transform
Authors
Abstract
This paper presents an efficient, computationally fast method for extracting text regions from document images. We use the Haar discrete wavelet transform (DWT) [9], which is the fastest wavelet transform to compute because its coefficients are either 1 or -1. This is one of the reasons we employ the Haar DWT to detect the edges of candidate text regions. First, we detect edges; then a line feature vector graph is generated from the edge map, and the stroke information is extracted. Finally, text regions are generated and filtered according to their line features. Experimental results show that the proposed method notably suppresses false alarms without increasing the computational cost. Furthermore, the method can be easily customized for applications with different tradeoffs between recall and precision.
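The edge-detection step can be illustrated with a minimal sketch, assuming a grayscale image held in a NumPy array. Because the Haar coefficients are 1 or -1, each sub-band reduces to sums and differences over 2x2 pixel blocks. The function names, the averaging normalization (division by 4), the way the detail sub-bands are combined, and the `threshold` parameter are all assumptions made for illustration, not details taken from the paper. The later stages (line feature vector graph, stroke extraction, filtering) are not covered.

```python
import numpy as np


def haar_dwt2(image):
    """One level of a 2-D Haar DWT using the averaging normalization (assumed)."""
    img = np.asarray(image, dtype=np.float64)
    # Crop to even dimensions so that pixels pair up into 2x2 blocks.
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]

    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 4.0      # low-pass in both directions
    horizontal = (a + b - c - d) / 4.0  # top row minus bottom row: horizontal edges
    vertical = (a - b + c - d) / 4.0    # left column minus right column: vertical edges
    diagonal = (a - b - c + d) / 4.0    # diagonal detail
    return approx, horizontal, vertical, diagonal


def edge_map(image, threshold):
    """Mark blocks whose combined detail magnitude exceeds a hypothetical threshold."""
    _, horizontal, vertical, diagonal = haar_dwt2(image)
    magnitude = np.abs(horizontal) + np.abs(vertical) + np.abs(diagonal)
    return magnitude > threshold
```

For example, `edge_map(page, threshold=10.0)` on a grayscale page image would return a boolean array at half the input resolution, which a later stage could group into candidate text regions. The threshold value here is arbitrary and would need tuning per dataset.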
Similar resources
Modified Method of Document Text Extraction from Document Images Using Haar DWT
This paper extends the technique used for document text extraction from images using the 2-D Haar wavelet. The discrete wavelet transform is a very useful tool for signal analysis and image processing, especially in multi-resolution representation. It can decompose a signal into different components in the frequency domain. Two-dimensional discrete wavelet transform (2-D DWT) decomposes an input imag...
Text Extraction of Vehicle Number Plate and Document Images Using Discrete Wavelet Transform in MATLAB
Text extraction from colour images is a challenging task in computer vision. The concept of text extraction is derived from vehicle plate recognition and the extraction of individual characters. Some examples of its applications are automatic image indexing, assistance for visually impaired people, optical character reading, and keyword searching in a document image. The continuous research has...
Image Segmentation for Text Extraction
This paper presents a methodology for extracting text from images such as document images and scene images. Text that appears in these images contains important and useful information. Text extraction from images has been used in a large variety of applications, such as mobile robot navigation, document retrieval, object identification, and vehicle license plate detection. In this paper, we emplo...
A Novel Method for Efficient Text Extraction from Real Time Images with Diversified Background using Haar Discrete Wavelet Transform and K-Means Clustering
The proposed system highlights a novel approach to extracting text from images using the two-dimensional Haar discrete wavelet transform and K-means clustering. As the commercial usage of digital content is on the rise, efficient and error-free text indexing, along with text localization and extraction, is of high importance. Majority of the previous research work on text ex...
Global Approach for Script Identification using Wavelet Packet Based Features
In a multi-script environment, archives of documents whose text regions are printed in different scripts are common in practice. For automatic processing of such documents through Optical Character Recognition (OCR), it is necessary to identify the different script regions of the document. In this paper, a novel texture-based approach is presented to identify the script type of the collection of documen...